Optimal unambiguous discrimination between subsets of non-orthogonal quantum states

It is known that unambiguous discrimination among non-orthogonal but linearly independent 
quantum states is possible with a certain probability of success. Here, we consider a variant of 
that problem. Instead of discriminating among all of the different states, we shall only discriminate 
between two subsets of them. In particular, for the case of three non-orthogonal states, 
$\{|\psi_1\rangle, |\psi_2\rangle, |\psi_3\rangle\}$, we show that the optimal strategy to distinguish $|\psi_1\rangle$ from the set $\{|\psi_2\rangle, |\psi_3\rangle\}$ has 
a higher success rate than if we wish to discriminate among all three states. Somewhat surprisingly, 
for unambiguous discrimination the subsets need not be linearly independent. A fully analytical 
solution is presented, and we also show how to construct generalized interferometers (multiports) 
which provide an optical implementation of the optimal strategy. 

PACS numbers: 03.67.-a, 03.65.Bz, 42.50.-p 



I. INTRODUCTION 

According to the quantum theory of measurement, it is 
impossible to unambiguously discriminate between non- 
orthogonal quantum states with unit success probability. 
If, however, we settle for less and don't require that we 
succeed every time, then unambiguous discrimination be- 
comes possible. This procedure uses a non-unitary opera- 
tion that maps the non-orthogonal states onto orthogonal 
ones, and these can then be discriminated without error 
using a standard von Neumann measurement. Although 
such an operation will always have a certain probability 
of failure, we can always tell whether or not the desired 
transformation has succeeded. This allows us to achieve 
unambiguous discrimination. When the attempt fails, we 
obtain an inconclusive answer. The optimal strategy for 
accomplishing this is the one that minimizes the average 
probability of failure. 

The problem of unambiguously distinguishing between 
two non-orthogonal states was first considered by 
Ivanovic [1], and then subsequently by Dieks and Peres 
[2, 3]. These authors found the optimal solution when the 
two states are being selected from an ensemble in which 
they are equally likely. The optimal solution for the 
situation in which the states have different weights was 
found by Jaeger and Shimony [4]. We proposed an optical 
implementation of the optimal procedure along with 
a more compact rederivation of the general results and 
also showed that the method is useful in other areas of 
quantum information processing [5] such as, for example, 
entanglement enhancement [6]. State discrimination 
measurements have been performed in the laboratory, first 
by Huttner et al. [7] and, more recently, by Clarke et 
al. [8]. Both used the polarization states of photons to 
represent qubits. The case of three states was examined 
by Peres and Terno [9]. It was subsequently extended to 
the general problem of discriminating among N states. 
Chefles [10] found that N non-orthogonal states can be 
probabilistically discriminated without error if and only 
if they are linearly independent. Chefles and Barnett [11] 
solved the case in which the probability of the procedure 
succeeding is the same for each of the states. Duan and 
Guo [12] considered general unitary transformations and 
measurements on a Hilbert space containing the states 
to be distinguished and an ancilla, which would allow 
one to discriminate among N states, and derived matrix 
inequalities which must be satisfied for the desired 
transformations to exist. In our previous paper [13], we 
presented the necessary conditions for optimal unambiguous 
discrimination and used them to derive a method for 
implementing the optimal solution. For the case of three 
states, we presented optical networks that accomplish 
this. One can also consider what happens if the 
discrimination is not completely unambiguous, i.e. if it is 
possible for errors to occur, and this was done by Chefles 
and Barnett [14]. For an overview of the state of the art 
on state discrimination see the excellent recent review 
article by Chefles [15]. 

In these works discrimination among all of the states 
was considered. In the present paper, we consider a vari- 
ant of that problem. Instead of discriminating among 
all states, we ask what happens if we just want to 
discriminate between subsets of them. A motivation to 
consider this variant comes from its application to comparing 
strings of qubits in order to find out if they are identical 
or not, which is certainly one of the basic tasks in 
quantum information processing. In particular, if there are 
three non-orthogonal states, $\{|\psi_1\rangle, |\psi_2\rangle, |\psi_3\rangle\}$, we wish 
to find the optimal strategy to unambiguously distinguish 
$|\psi_1\rangle$ from the set $\{|\psi_2\rangle, |\psi_3\rangle\}$. We refer to this 
problem as unambiguous quantum state filtering. In this 
context we should note that recently an analytical 
solution has been found to the following closely related 
problem. Instead of unambiguously distinguishing 
between two complementary subsets of an arbitrary 
number N of non-orthogonal quantum states, occupying a 
two-dimensional Hilbert space, errors are allowed but the 
probability of erroneously assigning the state to one of 
the subsets is minimized [16]. The term "quantum state 
filtering" has been introduced there for the case when one 
of the subsets contains one state and the other contains 
all of the remaining $N - 1$ states. Here, we shall present 
the analytical solution for the case of the other possible 
discrimination strategy, namely that of unambiguous 
quantum state filtering. 

The paper is divided into six sections. In Section II, 
based on simple but rigorous arguments, we present the 
optimal analytical solution to the problem. In Section 
III, we compare the optimal failure probabilities for two 
different procedures: discrimination between $|\psi_1\rangle$ and 
$\{|\psi_2\rangle, |\psi_3\rangle\}$ and discrimination among all three states. 
We find that the failure probability for the first 
procedure is smaller than that for the second. In Section IV, 
we propose a possible experimental implementation using 
the method proposed in our previous paper [13], which 
uses a single-photon representation of the quantum states 
and an optical multiport together with photon detection 
at the output ports to implement the procedure. A brief 
discussion and conclusions are given in Section V. Finally, 
in the Appendix, we present an alternative derivation, 
based on the method of Lagrange multipliers, to obtain 
the results of Section II. The method closely parallels the 
techniques used for unambiguous discrimination between 
all states. 



II. DERIVATION OF THE OPTIMAL 
SOLUTION 

Suppose we are given a quantum system prepared in 
the state $|\psi\rangle$, which is guaranteed to be a member of the 
set of three non-orthogonal states $\{|\psi_1\rangle, |\psi_2\rangle, |\psi_3\rangle\}$, but 
we do not know which one. We want to find a procedure 
which will tell us that $|\psi\rangle$ was prepared in $|\psi_1\rangle$, or will 
tell us that $|\psi\rangle$ was prepared in one of $\{|\psi_2\rangle, |\psi_3\rangle\}$. That 
is, the procedure can distinguish $|\psi_1\rangle$ from $\{|\psi_2\rangle, |\psi_3\rangle\}$. 
We also want this procedure to be error-free, i.e. the 
procedure may fail to give us any information about the 
state, and if it fails, it must let us know that it has, but 
if it succeeds, it should never give us a wrong answer. 
We shall refer to such a procedure as quantum state 
filtering without error. We find that, in contrast to the 
unambiguous state discrimination problem, this will be 
possible even if $|\psi_1\rangle$ is not linearly independent from the 
set $\{|\psi_2\rangle, |\psi_3\rangle\}$. 

If the states are not orthogonal then, according to the 
quantum theory of measurement, they cannot be 
discriminated perfectly. In other words, if we are given $|\psi_i\rangle$, we 
will have some probability $p_i$ to determine what it is 
successfully and, correspondingly, some failure probability, 
$q_i = 1 - p_i$, to obtain an inconclusive answer. If we 
denote by $\eta_i$ the a priori probability that the system was 
prepared in the state $|\psi_i\rangle$, the average probabilities of 
success and of failure to distinguish the states are 

$$P = \sum_i \eta_i p_i, \qquad Q = \sum_i \eta_i q_i, \tag{2.1}$$

respectively. Our objective is to find the set of $\{p_i\}$ that 
maximizes the probability of success, $P$. 

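The procedure we shall use is a "generalized measurement", 
which can be described as follows. Let $\mathcal{K}$ denote 
a total Hilbert space, which is the direct sum of two 
subspaces, $\mathcal{K} = \mathcal{H} \oplus \mathcal{A}$. The space $\mathcal{H}$ is a three-dimensional 
space that contains the vectors $|\psi_i\rangle$, and $\mathcal{A}$ is an 
auxiliary space. The input state of the system is one of the 
vectors $|\psi_i\rangle$, which is now a vector in the subspace $\mathcal{H}$ of 
the total space $\mathcal{K}$, so that 

$$|\psi_i^{\mathcal{K}}\rangle_{\rm in} = |\psi_i\rangle. \tag{2.2}$$

A unitary transformation, $U$, which acts in the entire 
space $\mathcal{K}$ is now applied to the input vector, resulting in 
the state $|\psi_i^{\mathcal{K}}\rangle_{\rm out}$, which is given by 

$$|\psi_i^{\mathcal{K}}\rangle_{\rm out} = U |\psi_i^{\mathcal{K}}\rangle_{\rm in} = |\psi_i'\rangle + |\phi_i\rangle, \tag{2.3}$$

where $|\psi_i'\rangle$ lies in $\mathcal{H}$, $|\phi_i\rangle$ lies in $\mathcal{A}$ and, in our case, 
$|\psi_1'\rangle$ can always be unambiguously 
distinguished from the set $\{|\psi_2'\rangle, |\psi_3'\rangle\}$. Then a measurement 
is performed on $|\psi_i^{\mathcal{K}}\rangle_{\rm out}$ that projects it either 
onto $|\psi_i'\rangle$ or onto $|\phi_i\rangle$ (by construction, they are in orthogonal 
subspaces). If it projects $|\psi_i^{\mathcal{K}}\rangle_{\rm out}$ onto $|\psi_i'\rangle$, the 
procedure succeeds, because $|\psi_1'\rangle$ can always be distinguished 
from $\{|\psi_2'\rangle, |\psi_3'\rangle\}$. The probability to get this outcome, 
if the input state is $|\psi_i\rangle$, is 

$$p_i = \langle \psi_i' | \psi_i' \rangle. \tag{2.4}$$

If the measurement projects $|\psi_i^{\mathcal{K}}\rangle_{\rm out}$ onto $|\phi_i\rangle$, the 
procedure fails. The probability of this outcome is 

$$q_i = 1 - p_i = \langle \phi_i | \phi_i \rangle. \tag{2.5}$$

The nature of the problem we are trying to solve 
imposes a number of requirements on the output vectors. 
The condition that $|\psi_1'\rangle$ be distinguishable from $|\psi_2'\rangle$ and 
$|\psi_3'\rangle$ requires that 

$$\langle \psi_1' | \psi_2' \rangle = \langle \psi_1' | \psi_3' \rangle = 0. \tag{2.6}$$

These lead to conditions on the failure vectors, $|\phi_i\rangle$. 
Taking the scalar product between $|\psi_1^{\mathcal{K}}\rangle_{\rm out}$ and the other two 
output states and using Eq. (2.6) and the fact that $U$ is 
unitary leads to the conditions 

$$\langle \psi_1 | \psi_2 \rangle = \langle \phi_1 | \phi_2 \rangle, \qquad \langle \psi_1 | \psi_3 \rangle = \langle \phi_1 | \phi_3 \rangle. \tag{2.7}$$

Our objective is to find the optimal $|\psi_i'\rangle$ and $|\phi_i\rangle$ which 
satisfy Eqs. (2.4)-(2.7) and also give the maximum 
success probability $P$. 

Let us now consider the failure vectors. If they were 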
linearly independent, we could apply a state 
discrimination procedure to them [10]. That means that if our 
original procedure fails, and we end up in the failure 
space, $\mathcal{A}$, then we still have some chance of 
determining what our input state was. This clearly implies that 
our original procedure, which led to the vectors $|\psi_i'\rangle$, was 
not optimal, because that process followed by another 
on the failure vectors would lead to a higher probability 
of distinguishing $|\psi_1\rangle$ from $|\psi_2\rangle$ and $|\psi_3\rangle$. Therefore, the 
optimal procedure should lead to failure vectors to which 
we cannot successfully apply a state discrimination 
procedure, implying that they are linearly dependent. In 
fact, we will now prove that for optimal discrimination 
they must be collinear, by demonstrating that the 
contrary leads to a contradiction. To this end, we assume that 
we have achieved optimal unambiguous discrimination of 
$|\psi_1\rangle$ from $|\psi_2\rangle$ and $|\psi_3\rangle$ but the failure vectors are not 
collinear. Then at least one of the two failure vectors, 
$|\phi_2\rangle$, $|\phi_3\rangle$, will have a component in the direction that 
is perpendicular to $|\phi_1\rangle$. We can set up a detector 
projecting onto this direction, and a positive outcome of the 
measurement (a click of the detector) will tell us that 
our input state was not $|\psi_1\rangle$ but one of the other two 
states. Thus, contrary to our assumption that our 
procedure has been optimal, further distinction is possible. 
Hence, the failure vectors must be collinear for optimal 
discrimination. 

We shall now explore the consequences of this 
conclusion. Since $|\phi_i\rangle$ ($i = 1, \ldots, n$) are collinear, the failure 
space, $\mathcal{A}$, is one dimensional. If $|u\rangle$ is the basis vector 
spanning this Hilbert space we can write the failure 
vectors as $|\phi_i\rangle = \sqrt{q_i}\, e^{i\chi_i} |u\rangle$. Substituting this representation 
of the failure vectors into Eq. (2.7), we find that 

$$q_1 q_2 = |\langle \psi_1 | \psi_2 \rangle|^2, \qquad q_1 q_3 = |\langle \psi_1 | \psi_3 \rangle|^2. \tag{2.8}$$

These two conditions are a consequence of unitarity 
and imply that only one of the three failure 
probabilities can be chosen independently. If we choose $q_1$ as 
the independent one we can express the other two as 
$q_2 = |\langle \psi_1 | \psi_2 \rangle|^2 / q_1$ and $q_3 = |\langle \psi_1 | \psi_3 \rangle|^2 / q_1$. If we 
introduce the notation $O_{ij} = \langle \psi_i | \psi_j \rangle$ then, with the help of 
these two equations, the average failure probability can 
be written explicitly as 

$$Q = \sum_i \eta_i q_i = \eta_1 q_1 + \frac{\eta_2 |O_{12}|^2 + \eta_3 |O_{13}|^2}{q_1}. \tag{2.9}$$

If we further introduce the notation $A = \eta_2 |O_{12}|^2 + 
\eta_3 |O_{13}|^2$ for the frequently occurring average overlap then, 
from the condition 

$$\frac{dQ}{dq_1} = 0, \tag{2.10}$$

we find the optimal value of $q_1$ to be 

$$q_1 = \sqrt{\frac{A}{\eta_1}}. \tag{2.11}$$
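
The minimization behind Eqs. (2.10) and (2.11) can be written out in one line. The following is a sketch that uses only the quantities already defined above ($\eta_1$, $A$, $q_1$); the second-derivative check and the closed form of the minimum are added here for convenience and follow directly from Eq. (2.9):

```latex
\begin{align}
Q(q_1) &= \eta_1 q_1 + \frac{A}{q_1}, \\
\frac{dQ}{dq_1} &= \eta_1 - \frac{A}{q_1^{2}} = 0
  \quad\Longrightarrow\quad q_1 = \sqrt{\frac{A}{\eta_1}}, \\
\frac{d^{2}Q}{dq_1^{2}} &= \frac{2A}{q_1^{3}} > 0,
  \qquad Q\Big|_{q_1 = \sqrt{A/\eta_1}} = 2\sqrt{\eta_1 A}.
\end{align}
```

Since the second derivative is positive for $q_1 > 0$, the stationary point is a minimum of $Q$ whenever it lies in the allowed range of $q_1$.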



This value, however, cannot always be realized. For it 
to be realizable, there must be a unitary transformation, from 
Eq. (2.3), that takes $|\psi_j\rangle$ to $|\psi_j\rangle_{\rm out}$ which, together with 
the one-dimensionality of the failure space, yields 

$$|\psi_j\rangle_{\rm out} = |\psi_j'\rangle + \sqrt{q_j}\, e^{i\chi_j} |u\rangle. \tag{2.12}$$

Here we have that $\langle \psi_j' | u \rangle = 0$, $\langle \psi_1' | \psi_j' \rangle = 0$ for $j = 2, 3$, 
and the phase factors are fixed by the requirement (cf. 
Eq. (2.7)) that 

$$\langle \psi_1 | \psi_j \rangle = \sqrt{q_1 q_j}\, e^{i(\chi_j - \chi_1)} \tag{2.13}$$

for $j = 2, 3$. These equations imply that 

$$\langle \psi_j' | \psi_k' \rangle = \langle \psi_j | \psi_k \rangle - \sqrt{q_j q_k}\, e^{i(\chi_k - \chi_j)}. \tag{2.14}$$

This set of equations can only be true if the matrix $M$, 
where 

$$M_{jk} = \langle \psi_j | \psi_k \rangle - \sqrt{q_j q_k}\, e^{i(\chi_k - \chi_j)}, \tag{2.15}$$

is positive semidefinite, as discussed in detail in Ref. [13]. 
Using again $O_{jk} = \langle \psi_j | \psi_k \rangle$, $M$ can be expressed as 

$$M = \begin{pmatrix}
1 - q_1 & 0 & 0 \\
0 & 1 - \dfrac{|O_{12}|^2}{q_1} & O_{23} - \dfrac{O_{21} O_{13}}{q_1} \\
0 & O_{32} - \dfrac{O_{31} O_{12}}{q_1} & 1 - \dfrac{|O_{13}|^2}{q_1}
\end{pmatrix}. \tag{2.16}$$

Clearly, this matrix will be positive semidefinite if $0 \le 
q_1 \le 1$, and if the $2 \times 2$ submatrix is also positive 
semidefinite. This will be true if both the trace and determinant 
of the submatrix are greater than or equal to zero. 
Positivity requires that the diagonal matrix elements of the 
submatrix be non-negative, so that it must be true that 
$q_1 \ge |O_{12}|^2$ and $q_1 \ge |O_{13}|^2$. Without loss of generality, 
we can assume that $|O_{12}| \ge |O_{13}|$ by simply arranging 
the states in set 2 in the order of decreasing overlaps 
with $|\psi_1\rangle$. Doing so and imposing the condition that 
$q_1 \ge |O_{12}|^2$ guarantees that the condition $q_1 \ge |O_{13}|^2$ is 
also satisfied, and together they imply that the trace is 
greater than or equal to zero. 

The condition that the determinant be non-negative 
gives us a lower bound on $q_1$, 

$$q_1 \ge \frac{|O_{12}|^2 + |O_{13}|^2 - (O_{12} O_{23} O_{31} + O_{13} O_{32} O_{21})}{1 - |O_{23}|^2}. \tag{2.17}$$

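We want to interpret this inequality; in particular, we 
want to find what the right-hand side is equal to. In 
order to do so, we shall find the projection operator, $P_{23}$, 
that projects onto the subspace spanned by $|\psi_2\rangle$ and $|\psi_3\rangle$. 
One of the basis vectors in this subspace can be chosen to 
be $|\psi_2\rangle$ and, using the Gram-Schmidt orthogonalization 
method, the other is defined as the (normalized) 
orthogonal component of $|\psi_3\rangle$, 

$$|\tilde{\psi}_3\rangle = \frac{1}{\sqrt{1 - |O_{23}|^2}} \left( |\psi_3\rangle - O_{23} |\psi_2\rangle \right). \tag{2.18}$$

This leads to 

$$P_{23} = |\psi_2\rangle\langle\psi_2| + |\tilde{\psi}_3\rangle\langle\tilde{\psi}_3|. \tag{2.19}$$

Let us represent the input state, $|\psi_1\rangle$, as 
$|\psi_1\rangle = |\psi_1^{\parallel}\rangle + |\psi_1^{\perp}\rangle$, where $|\psi_1^{\perp}\rangle = (1 - P_{23})|\psi_1\rangle$ is the component of 
the input vector that is perpendicular to the subspace 
spanned by $|\psi_2\rangle$ and $|\psi_3\rangle$ and $|\psi_1^{\parallel}\rangle = P_{23}|\psi_1\rangle$ is the 
component in that subspace. Then, using Eqs. (2.18) and 
(2.19), the explicit expression for the parallel component 
is given by 

$$|\psi_1^{\parallel}\rangle = \frac{O_{21} - O_{23} O_{31}}{1 - |O_{23}|^2}\, |\psi_2\rangle + \frac{O_{31} - O_{32} O_{21}}{1 - |O_{23}|^2}\, |\psi_3\rangle. \tag{2.20}$$

Calculating the norm of this expression yields 

$$\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle = \frac{|O_{12}|^2 + |O_{13}|^2 - (O_{12} O_{23} O_{31} + O_{13} O_{32} O_{21})}{1 - |O_{23}|^2}, \tag{2.21}$$

which is identical to the right-hand side of Eq. (2.17). 

Thus, Eq. (2.17) tells us that the failure probability, 
$q_1$, has a lower bound which is given by the weight of 
$|\psi_1\rangle$ in the other subspace, $\| P_{23} \psi_1 \|^2 = \langle \psi_1 | P_{23} | \psi_1 \rangle = 
\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle$, a result that is intuitively obvious. Clearly, this 
expression is larger than (or at most equal to) $|O_{12}|^2$. 
This implies that, because $q_2 = |O_{12}|^2 / q_1$, we have 

$$q_2 \le \frac{|O_{12}|^2}{\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle} \le 1, \tag{2.22}$$

and similarly for $q_3$. 

We can then distinguish three different regimes of the 
parameters. If the r.h.s. of Eq. (2.11) is greater than 1 
then $q_1 = 1$, if it is less than $\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle$ then $q_1 = \langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle$, 
and in the intermediate range the optimum given by Eq. 
(2.11) is realized. This can be summarized as follows. 

(i) If $\eta_1 |\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle|^2 \le A \le \eta_1$, then 

$$q_1 = \sqrt{A/\eta_1}, \qquad q_2 = \sqrt{\eta_1/A}\, |O_{12}|^2, \qquad q_3 = \sqrt{\eta_1/A}\, |O_{13}|^2, \tag{2.23}$$

yielding the average failure probability 

$$Q = 2\sqrt{\eta_1 A}. \tag{2.24}$$

(ii) If $A > \eta_1$, then 

$$q_1 = 1, \qquad q_2 = |O_{12}|^2, \qquad q_3 = |O_{13}|^2, \tag{2.25}$$

yielding the average failure probability 

$$Q = \eta_1 + A. \tag{2.26}$$

(iii) If $A < \eta_1 |\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle|^2$, then 

$$q_1 = \langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle, \qquad q_2 = \frac{|O_{12}|^2}{\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle}, \qquad q_3 = \frac{|O_{13}|^2}{\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle}, \tag{2.27}$$

yielding the average failure probability 

$$Q = \eta_1 \langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle + \frac{A}{\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle}. \tag{2.28}$$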

Equations (2.23)-(2.28) summarize our main results. 
In the intermediate range of the average overlap, $A$, the 
optimal failure probability, Eq. (2.24), is achieved by a 
generalized measurement or POVM. Outside this region, 
for very large average overlap, $A > \eta_1$, or very small 
average overlap, $A < \eta_1 |\langle \psi_1^{\parallel} | \psi_1^{\parallel} \rangle|^2$, the optimal failure 
probabilities, Eqs. (2.26) and (2.28), are realized by 
standard von Neumann measurements. For very large $A$ the 
optimal von Neumann measurement consists of 
projections onto $|\psi_1\rangle$ and two orthogonal directions whose 
directionality need not be specified further. A click along 
$|\psi_1\rangle$ corresponds to failure because it can have its 
origin in either of the two subsets, and a click in the 
orthogonal directions uniquely assigns the input state to the 
set $\{|\psi_2\rangle, |\psi_3\rangle\}$. For very small $A$ the optimal von 
Neumann measurement consists of projections onto $|\psi_1^{\parallel}\rangle$ and 
two orthogonal directions that are uniquely determined 
by the requirement that they correspond to two 
mutually exclusive alternatives. One of them is onto $|\psi_1^{\perp}\rangle$ 
and the other onto the remaining orthogonal direction 
in the subspace of $\{|\psi_2\rangle, |\psi_3\rangle\}$. A click along $|\psi_1^{\parallel}\rangle$ 
corresponds to failure because it can originate from any of 
the input states, while a click in any of the alternative 
directions unambiguously assigns the input to one or the 
other of the two mutually exclusive subsets. It is 
interesting to observe that the failure space is one dimensional 
for each of the three different optimal measurements in 
the three different regions. At the boundaries of their 
respective regions of validity, the optimal measurements 
transform into one another continuously. Furthermore, 
each of the two von Neumann expressions can be written 
as the arithmetic mean of two terms and the POVM 
result as the geometric mean of the same two terms. Since a 
geometric mean never exceeds the corresponding arithmetic mean, 
in its range of validity the POVM performs better 
than any von Neumann measurement. 
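
The three regimes can be collected into a small numerical routine. What follows is a minimal sketch, not part of the original derivation: the function names `parallel_weight` and `optimal_filtering` are hypothetical, the overlaps $O_{12}$, $O_{13}$, $O_{23}$ and the a priori probabilities are passed in directly, and it is assumed that $|O_{23}| < 1$ and that at least one of $O_{12}$, $O_{13}$ is nonzero.

```python
import math


def parallel_weight(o12, o13, o23):
    """<psi_1^par|psi_1^par> computed from the overlaps, following Eq. (2.21)."""
    o12, o13, o23 = complex(o12), complex(o13), complex(o23)
    # O12*O23*O31 + O13*O32*O21 equals twice the real part of O12*O23*O31,
    # because O31 is the complex conjugate of O13.
    cross = 2.0 * (o12 * o23 * o13.conjugate()).real
    return (abs(o12) ** 2 + abs(o13) ** 2 - cross) / (1.0 - abs(o23) ** 2)


def optimal_filtering(o12, o13, o23, eta1, eta2, eta3):
    """Return (regime, (q1, q2, q3), Q) according to regimes (i)-(iii)."""
    avg_overlap = eta2 * abs(o12) ** 2 + eta3 * abs(o13) ** 2  # the quantity A
    weight = parallel_weight(o12, o13, o23)
    q1_free = math.sqrt(avg_overlap / eta1)  # unconstrained optimum, Eq. (2.11)
    if q1_free > 1.0:
        regime, q1 = "ii", 1.0       # von Neumann measurement, Eq. (2.25)
    elif q1_free < weight:
        regime, q1 = "iii", weight   # von Neumann measurement, Eq. (2.27)
    else:
        regime, q1 = "i", q1_free    # POVM, Eq. (2.23)
    q2 = abs(o12) ** 2 / q1          # unitarity constraint, Eq. (2.8)
    q3 = abs(o13) ** 2 / q1
    return regime, (q1, q2, q3), eta1 * q1 + eta2 * q2 + eta3 * q3
```

For instance, `optimal_filtering(0.5, 0.5, 0.5, 1/3, 1/3, 1/3)` falls in the POVM regime, while making the first state much more likely, as in `optimal_filtering(0.5, 0.5, 0.5, 0.9, 0.05, 0.05)`, moves the optimum to the von Neumann measurement of regime (iii).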

In closing this Section we want to point out an 
interesting feature of the solution. The results hold true even 
when there is no perpendicular component of the first 
input state, $|\psi_1^{\perp}\rangle = 0$, i.e. it lies entirely in the Hilbert 
space spanned by the other two vectors or, in other words, 
the two sets are linearly dependent. In this case the two 
von Neumann measurements coincide and the range of 
validity of the POVM solution shrinks to zero. A click in 
the detector along the first input vector corresponds to 
failure (it might originate from either of the two subsets), 
and a click in the detector along the single direction 
orthogonal to it unambiguously identifies the set of the 
other two vectors. 

An alternative derivation of the above results, which is 
based on the method of Lagrange multipliers, is given 
in the Appendix. 



III. COMPARISON TO THE CASE WHEN ALL 
STATES ARE DISCRIMINATED 

In this section we want to compare the average 
probability of failure $Q$ of the filtering problem to that of 
distinguishing all three states. Let $Q'$ denote the 
average probability of failure for distinguishing all the states 
$\{|\psi_1\rangle, |\psi_2\rangle, |\psi_3\rangle\}$. We can see immediately that the 
probability of failure to distinguish $|\psi_1\rangle$ from $\{|\psi_2\rangle, |\psi_3\rangle\}$, $Q$, 
should be no larger than $Q'$. For the latter problem, the 
necessary condition for achieving optimal discrimination 
is 

$$\begin{vmatrix}
q_1 & O_{12} & O_{13} \\
O_{12}^* & q_2 & O_{23} \\
O_{13}^* & O_{23}^* & q_3
\end{vmatrix} = 0. \tag{3.1}$$

When comparing this equation to Eq. (A1), we see that 
instead of a given constant $O_{23}$ that appears in Eq. (3.1), 
there are the variables $r$ and $\theta$ in Eq. (A1). These 
variables are chosen to minimize the average probability of 
failure $Q$. Therefore, $Q$ should be no larger than $Q'$, 
$Q \le Q'$. 

To illustrate this point, we use a simple symmetric 
case, where all of the overlaps between the states are real 
and equal, 

$$\langle \psi_1 | \psi_2 \rangle = \langle \psi_1 | \psi_3 \rangle = \langle \psi_2 | \psi_3 \rangle = s, \tag{3.2}$$

with $0 < s < 1$. We shall also assume that the a priori 
probabilities are equal for all the examples in this paper. 
From previous work we know that in this case, the 
optimal values of the failure probabilities when we wish to 
distinguish among all of the states $\{|\psi_1\rangle, |\psi_2\rangle, |\psi_3\rangle\}$ are 
$q_i = s$, which implies that $Q' = s$ [13]. 

For the problem of distinguishing $|\psi_1\rangle$ from 
$\{|\psi_2\rangle, |\psi_3\rangle\}$, from the results of Eqs. (2.23) and 
(2.25), we have (i) if $0 < s < 1/\sqrt{2}$, then 

$$q_1 = \sqrt{2}\, s, \qquad q_2 = q_3 = \frac{\sqrt{2}}{2}\, s, \qquad Q = \frac{2\sqrt{2}}{3}\, s. \tag{3.3}$$

So the average probability of failure $Q$ is less than $Q' = s$. 
(ii) If $1/\sqrt{2} < s < 1$, then 

$$q_1 = 1, \qquad q_2 = q_3 = s^2, \qquad Q = \frac{1}{3} + \frac{2}{3}\, s^2. \tag{3.4}$$

These solutions are illustrated and compared to $Q'$ in 
Figure 1. Note that in both cases we have that $Q < s = Q'$. 

FIG. 1: We compare $Q$ and $Q'$ as functions of $s$. For $0 < s < 1/\sqrt{2}$ we have that 
$Q' = s$ (long dashed line) and $Q = \frac{2\sqrt{2}}{3} s$ (short dashed line). For $1/\sqrt{2} < s < 1$, we still have that 
$Q' = s$, but $Q = \frac{1}{3} + \frac{2}{3} s^2$ (solid line). Note that $Q$ is always smaller than $Q'$. 

Now we shall compare filtering to the problem of 
distinguishing two states $\{|\psi_1\rangle, |\psi_2\rangle\}$, when all the a priori 
probabilities are equal. If we denote by $Q''$ the 
average probability of failure when distinguishing 
between the two states $|\psi_1\rangle$ and $|\psi_2\rangle$, we know that 
$Q'' = |O_{12}|$ (see the two-state references cited in the Introduction). For the case we are 
considering, $|O_{12}| = |O_{13}| = s$, and we see that $Q < Q''$. 

A second example is more illuminating. The overlaps 
are now given by 



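$$\langle \psi_1 | \psi_2 \rangle = \langle \psi_1 | \psi_3 \rangle = s_1, \qquad \langle \psi_2 | \psi_3 \rangle = s_2, \tag{3.5}$$

where, for simplicity, $s_1$ and $s_2$ are real, $0 < s_1, s_2 < 1$, 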
and 

$$0 < s_1 < \frac{\sqrt{2}}{2}, \qquad s_1^2 < s_2, \qquad \text{and} \qquad s_1 < 2 s_2. \tag{3.6}$$

The probabilities of failure for discriminating $|\psi_1\rangle$ from 
$\{|\psi_2\rangle, |\psi_3\rangle\}$ are 

$$q_1 = \sqrt{2}\, s_1, \qquad q_2 = q_3 = \frac{\sqrt{2}}{2}\, s_1, \tag{3.7}$$

and the average failure probability is 

$$Q = \frac{2\sqrt{2}}{3}\, s_1. \tag{3.8}$$

From the above equation, we see that when $s_1$ is much 
smaller than $s_2$, $Q$ is much smaller than $Q'$. For example, 
when $s_1 = \ldots$ and $s_2 = \ldots$, $Q/Q' = 0.47$. 

FIG. 2: An optical eight-port. The beams are straight lines, 
a suitable beam splitter is placed at each point where two 
beams intersect, phase shifters are at one input of each beam 
splitter and at each output. 
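
The comparison made in this section for the symmetric case of Eq. (3.2) with equal a priori probabilities is easy to tabulate. The sketch below is an illustration only; the function names are hypothetical, and the piecewise form simply transcribes $Q = \frac{2\sqrt{2}}{3} s$ for $s \le 1/\sqrt{2}$ and $Q = \frac{1}{3} + \frac{2}{3} s^2$ above that value, together with $Q' = s$.

```python
import math


def q_filter_symmetric(s):
    """Optimal average failure probability Q for filtering, equal overlaps s."""
    if not 0.0 < s < 1.0:
        raise ValueError("the overlap s must satisfy 0 < s < 1")
    if s <= 1.0 / math.sqrt(2.0):
        return 2.0 * math.sqrt(2.0) * s / 3.0   # POVM regime
    return 1.0 / 3.0 + 2.0 * s * s / 3.0        # von Neumann regime


def q_all_symmetric(s):
    """Average failure probability Q' for discriminating all three states."""
    return s


if __name__ == "__main__":
    for k in range(1, 10):
        s = k / 10.0
        print(f"s = {s:.1f}  Q = {q_filter_symmetric(s):.4f}  Q' = {q_all_symmetric(s):.4f}")
```

The two branches agree at $s = 1/\sqrt{2}$, where both give $Q = 2/3$, and $Q$ stays below $Q'$ over the whole interval $0 < s < 1$.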



IV. OPTICAL REALIZATION 

Now we shall present a scheme for a possible experi- 
mental realization of the optimal discrimination between 
$|\psi_1\rangle$ and $\{|\psi_2\rangle, |\psi_3\rangle\}$. The method is similar to one we 
proposed in a previous publication |Q. We shall use 
single photon states to represent the input and output 
states, and an optical eight-port together with photon 
detectors placed at the output ports to realize the uni- 
tary transformation and subsequent measurements. 

Our states will be a single photon split among several 
modes. Each mode will serve as an input to an optical 
eight-port. Recall that the dimension of the total Hilbert 
space is four, so we shall require four modes, and the 
input states $|\psi_i\rangle_{\rm in}$ will be represented by single photon 
states as 

$$|\psi_i\rangle_{\rm in} = \sum_{j=1}^{4} d_{ij}\, a_j^{\dagger} |0\rangle, \tag{4.1}$$

where $\sum_{j=1}^{4} |d_{ij}|^2 = 1$, and $a_j^{\dagger}$ is the creation operator 
for the $j$th mode. We shall require $d_{i4} = 0$ for $i = 1, 2, 3$, 
that is, the initial single photon state is sent to the first 
three input ports, and the vacuum into the fourth input 
port. The first three modes correspond to the space, $\mathcal{H}$, 
containing the states to be distinguished and the fourth 
mode to the failure space, A. 

In general, an optical 2iV-port is a lossless linear device 
with N input ports and N output ports. Its action on 
the input states can be described by a unitary operator, 
$U_{2N}$, and physically it consists of an arrangement of beam 
splitters, phase shifters, and mirrors. Since the dimension 
of the input and output states is four, here we shall use 
an eight-port (see Figure 2). 

If we denote the annihilation operators corresponding to 
the input modes of the eight-port by $a_j$, $j = 1, \ldots, 4$, 
then the output operators are given by 

$$a_{j,{\rm out}} = U^{\dagger} a_j U = \sum_{k=1}^{4} M_{jk}\, a_k, \tag{4.2}$$

where $M_{jk}$ are the elements of a $4 \times 4$ unitary matrix 
$M(4)$. In the Schrödinger picture, the in and out states 
are related by 

$$|\psi\rangle_{\rm out} = U |\psi\rangle_{\rm in}. \tag{4.3}$$

It can be shown [13] that when using the single photon state 
representation, the matrix element $M_{il}$ is the same as the 
matrix element of $U$ between the single-particle states 
$|i\rangle = a_i^{\dagger}|0\rangle$ and $|l\rangle = a_l^{\dagger}|0\rangle$, i.e., 

$$\langle i | U | l \rangle = M_{il}. \tag{4.4}$$



To design the desired eight-port, we first calculate the 
optimal values of the $q_i$. Then from Eq. (2.3) and the fact that 
our failure space is one-dimensional, the vectors $|\phi_i\rangle$ are 
given by 

$$|\phi_i\rangle = \sqrt{q_i}\, e^{i\chi_i} |1_4\rangle, \tag{4.5}$$

where the state $|1_4\rangle$ denotes the one photon state in the 
failure space, which is just one photon in mode 4. Once the 
vectors $|\phi_i\rangle$ are determined, the inner products $\langle \psi_i' | \psi_j' \rangle$ 
($i, j = 1, 2, 3$) are given by 

$$\langle \psi_i' | \psi_j' \rangle = {}_{\rm in}\langle \psi_i | \psi_j \rangle_{\rm in} - \langle \phi_i | \phi_j \rangle. \tag{4.6}$$

We then have to find vectors $|\psi_i'\rangle$ that satisfy this 
equation. The answer is not unique, and one way of 
proceeding is the following. If we define the hermitian matrix $L$ 
to be 

$$L_{ij} = {}_{\rm in}\langle \psi_i | \psi_j \rangle_{\rm in} - \langle \phi_i | \phi_j \rangle, \tag{4.7}$$

then we note from Eq. (2.7) that $L_{12} = L_{13} = 0$. This 
implies that the simplest choice for $|\psi_1'\rangle$ is a vector with 
only one nonzero component. Then the vectors $|\psi_2'\rangle$ and 
$|\psi_3'\rangle$ will have nonzero components in only their other 
two places. The obvious choice is 

$$|\psi_1'\rangle = \begin{pmatrix} \sqrt{p_1} \\ 0 \\ 0 \\ 0 \end{pmatrix}. \tag{4.8}$$

In this column vector, the first entry is the amplitude of 
the photon to be in mode 1, the second is the amplitude 
to be in mode 2, etc. Mode 4 corresponds to the failure 
space, $\mathcal{A}$. The vectors $|\psi_2'\rangle$ and $|\psi_3'\rangle$ will have nonzero 
components in only their second and third places, and if 
their overlap is real, we can choose 

$$|\psi_2'\rangle = \begin{pmatrix} 0 \\ \sqrt{p_2}\cos\theta \\ \sqrt{p_2}\sin\theta \\ 0 \end{pmatrix}, \qquad
|\psi_3'\rangle = \begin{pmatrix} 0 \\ \sqrt{p_3}\cos\theta \\ -\sqrt{p_3}\sin\theta \\ 0 \end{pmatrix}, \tag{4.9}$$

where 

$$\theta = \frac{1}{2} \cos^{-1}\!\left( \frac{L_{23}}{\sqrt{p_2 p_3}} \right). \tag{4.10}$$

This simple choice works for the last example in this 
section (see Eq. (4.22), below). For the first, somewhat more 
general, example we are forced to choose the second 
component of $|\psi_1'\rangle$ to be nonzero and then the first and third 
components of the other two success vectors are different 
from zero. They can be obtained by simply interchanging 
the first and second components in the above expressions 
of the vectors $|\psi_i'\rangle$ (see Eq. (4.13), below). 
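
The construction of the success vectors can be checked numerically. The following sketch assumes real overlaps and takes the parametrization to be $|\psi_2'\rangle = \sqrt{p_2}(0, \cos\theta, \sin\theta, 0)$ and $|\psi_3'\rangle = \sqrt{p_3}(0, \cos\theta, -\sin\theta, 0)$ with $\theta = \frac{1}{2}\cos^{-1}(L_{23}/\sqrt{p_2 p_3})$; the helper names `success_vectors` and `dot` are hypothetical.

```python
import math


def success_vectors(p1, p2, p3, l23):
    """Four-mode success vectors for real L23 with |L23| <= sqrt(p2 * p3)."""
    theta = 0.5 * math.acos(l23 / math.sqrt(p2 * p3))
    v1 = [math.sqrt(p1), 0.0, 0.0, 0.0]
    v2 = [0.0, math.sqrt(p2) * math.cos(theta), math.sqrt(p2) * math.sin(theta), 0.0]
    v3 = [0.0, math.sqrt(p3) * math.cos(theta), -math.sqrt(p3) * math.sin(theta), 0.0]
    return v1, v2, v3


def dot(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    v1, v2, v3 = success_vectors(0.4, 0.5, 0.6, 0.2)
    # The overlap of the second and third vectors reproduces L23,
    # because cos^2(theta) - sin^2(theta) = cos(2 * theta).
    print(dot(v2, v3), dot(v1, v2), dot(v1, v3))
```

The norms of the three vectors return the success probabilities $p_1$, $p_2$, $p_3$, and the first vector is orthogonal to the other two, as required by Eq. (2.6).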

Once we have the input and output vectors, the unitary 
transformation, $U$, which maps the input states onto the 
output states can then be chosen, and this, as shown by 
Eq. (4.4), gives the explicit form of $M(4)$. Furthermore, 
$M(4)$ can be factorized as a product of two-dimensional 
U(2) transformations [13, 17], and any U(2) transformation 
can be implemented by a lossless beam splitter and 
a phase shifter with appropriate parameters. A beam 
splitter with a phase shifter at one output port 
transforms the input operators into output operators as 

$$\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}_{\rm out} =
\begin{pmatrix} e^{i\phi}\sin\omega & e^{i\phi}\cos\omega \\ \cos\omega & -\sin\omega \end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}_{\rm in}, \tag{4.11}$$

where $a_1$, $a_2$ are the annihilation operators of modes 1 
and 2 respectively, $\omega$ describes the reflectivity and 
transmittance of the beam splitter, and $\phi$ describes the effect 
of the phase shifter (in the factorization method given 
by M. Reck et al. [17], the phase shifters described by 
$\phi$ should be placed at the input ports). Therefore, we 
can use appropriate beam splitters, phase shifters and a 
mirror to construct the desired eight-port. 
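
As a quick sanity check on the beam-splitter building block, one can verify that a matrix of this type is unitary for any setting. The sketch below assumes the form with rows $(e^{i\phi}\sin\omega,\ e^{i\phi}\cos\omega)$ and $(\cos\omega,\ -\sin\omega)$, i.e. the phase shifter acting on the first output; the function names are hypothetical and other sign conventions are equally possible.

```python
import cmath
import math


def beam_splitter(omega, phi):
    """2x2 transformation of a beam splitter followed by a phase shifter
    on the first output port (assumed convention)."""
    phase = cmath.exp(1j * phi)
    return [
        [phase * math.sin(omega), phase * math.cos(omega)],
        [math.cos(omega), -math.sin(omega)],
    ]


def is_unitary(m, tol=1e-12):
    """Check that m times its conjugate transpose is the identity."""
    n = len(m)
    for i in range(n):
        for j in range(n):
            entry = sum(m[i][k] * complex(m[j][k]).conjugate() for k in range(n))
            if abs(entry - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


if __name__ == "__main__":
    print(is_unitary(beam_splitter(0.3, 1.1)))
```

With $\omega = \pi/4$ and $\phi = 0$ all four entries have modulus $1/\sqrt{2}$, which is the 50-50 beam splitter used in the second example of this section.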

Finally, photon detection is performed at the four out- 
put ports. We can design the total transformation in 
such a way that if the photon is detected at the first 
output port, we claim with certainty that the initial state 
was $|\psi_1\rangle$; if the photon is detected at the second or the 
third output port, we claim with certainty that the initial 
state was either $|\psi_2\rangle$ or $|\psi_3\rangle$, but we do not know which 
of these two states it was. If the photon is detected at 
the fourth output port, we obtain no information about 
the input state. 

We shall now consider two examples. The first is more 
general than the second, but the second has the advan- 
tage that it is simple and the eight-port that it requires 



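consists of only two 50-50 beam splitters. In the first 
example, all of the input vectors have the same 
overlap, which is given by $s$, and we shall consider the case 
$0 < s < 1/\sqrt{2}$. The optimal failure probabilities for this 
case are given in Eq. (3.3). For the input vectors we shall 
take 

$$|\psi_1\rangle_{\rm in} = \begin{pmatrix} \sqrt{\frac{1+2s}{3}} \\ \frac{2}{\sqrt{6}}(1-s)^{1/2} \\ 0 \\ 0 \end{pmatrix}, \quad
|\psi_2\rangle_{\rm in} = \begin{pmatrix} \sqrt{\frac{1+2s}{3}} \\ -\frac{1}{\sqrt{6}}(1-s)^{1/2} \\ \frac{1}{\sqrt{2}}(1-s)^{1/2} \\ 0 \end{pmatrix}, \quad
|\psi_3\rangle_{\rm in} = \begin{pmatrix} \sqrt{\frac{1+2s}{3}} \\ -\frac{1}{\sqrt{6}}(1-s)^{1/2} \\ -\frac{1}{\sqrt{2}}(1-s)^{1/2} \\ 0 \end{pmatrix}. \tag{4.12}$$

The output vectors, $|\psi_i\rangle_{\rm out} = |\psi_i'\rangle + |\phi_i\rangle$, can be 
computed by the method outlined above. Doing so gives us 

$$|\psi_1\rangle_{\rm out} = \begin{pmatrix} 0 \\ (1 - \sqrt{2}\,s)^{1/2} \\ 0 \\ (s\sqrt{2})^{1/2} \end{pmatrix}, \quad
|\psi_2\rangle_{\rm out} = \begin{pmatrix} ((1 + s - s\sqrt{2})/2)^{1/2} \\ 0 \\ ((1-s)/2)^{1/2} \\ (s/\sqrt{2})^{1/2} \end{pmatrix}, \quad
|\psi_3\rangle_{\rm out} = \begin{pmatrix} ((1 + s - s\sqrt{2})/2)^{1/2} \\ 0 \\ -((1-s)/2)^{1/2} \\ (s/\sqrt{2})^{1/2} \end{pmatrix}. \tag{4.13}$$

Our next step is to determine the transformation, $U$, that 
describes the eight-port, or, more specifically, the matrix 
$M(4)$ that describes its action in the one-photon 
subspace. It must satisfy $|\psi_i\rangle_{\rm out} = U|\psi_i\rangle_{\rm in}$, and, in addition, 
it must map the vector that is orthogonal to all three 
input vectors onto the vector that is orthogonal to all 
three output vectors, 

$$\frac{1}{A} \begin{pmatrix} -(s\sqrt{2})^{1/2} B \\ -(s\sqrt{2})^{1/2} C \\ 0 \\ BC \end{pmatrix}
= M(4) \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \tag{4.14}$$

where 

$$A = [(1-s)(1+2s)]^{1/2}, \qquad B = (1 - s\sqrt{2})^{1/2}, \qquad C = (1 + s - s\sqrt{2})^{1/2}. \tag{4.15}$$

These equations determine $M(4)$ and it is given by 

$$M(4) = \begin{pmatrix}
\sqrt{\frac{2}{3}}\,\frac{C}{\sqrt{1+2s}} & -\frac{C}{\sqrt{3(1-s)}} & 0 & -\frac{(s\sqrt{2})^{1/2} B}{A} \\
\frac{B}{\sqrt{3(1+2s)}} & \sqrt{\frac{2}{3}}\,\frac{B}{\sqrt{1-s}} & 0 & -\frac{(s\sqrt{2})^{1/2} C}{A} \\
0 & 0 & 1 & 0 \\
\frac{(\sqrt{2}+1)(s\sqrt{2})^{1/2}}{\sqrt{3(1+2s)}} & \frac{(\sqrt{2}-1)(s\sqrt{2})^{1/2}}{\sqrt{3(1-s)}} & 0 & \frac{BC}{A}
\end{pmatrix}. \tag{4.16}$$

This matrix can be expressed as the product of three 
matrices each of which corresponds to a beam splitter. 
In particular, we have that 

$$M(4) = T_{2,4}\, T_{1,4}\, T_{1,2}, \tag{4.17}$$

where the matrix $T_{p,q}$ represents the action of a beam 
splitter that mixes only modes $p$ and $q$. The $4 \times 4$ matrix 
for $T_{p,q}$ can be obtained from that of a $4 \times 4$ identity 
matrix, $I$, by replacing the matrix elements $I_{pp}$ and $I_{qq}$ 
by the transmissivity of the beam splitter, $t$, replacing 
$I_{pq}$ by the reflectivity, $r$, and replacing $I_{qp}$ by $-r$. The 
transmissivities and reflectivities for the beam splitters in Eq. 
(4.17) are 

$$\begin{aligned}
T_{2,4}&: \quad t = B, & r &= -(s\sqrt{2})^{1/2}, \\
T_{1,4}&: \quad t = \frac{C}{A}, & r &= -\frac{(s\sqrt{2})^{1/2} B}{A}, \\
T_{1,2}&: \quad t = \sqrt{\frac{2(1-s)}{3}}, & r &= -\sqrt{\frac{1+2s}{3}}.
\end{aligned} \tag{4.18}$$

This constitutes a complete description of the optical 
network that optimally discriminates between $|\psi_1\rangle_{\rm in}$ and 
$\{|\psi_2\rangle_{\rm in}, |\psi_3\rangle_{\rm in}\}$, where these input states are given in Eq. 
(4.12), and it is shown schematically in Figure 3. 

FIG. 3: The eight-port described by Eq. (4.16) can be 
constructed from three beam splitters and a mirror. 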
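
The factorization into three beam splitters can be tested numerically for any overlap in the range $0 < s < 1/\sqrt{2}$. The sketch below is an illustration under stated assumptions: the input and output vectors, and the transmissivities and reflectivities (including their signs), are the readings written in the code, and all function names are hypothetical. It builds each $T_{p,q}$ from the identity as described in the text and checks that the product $T_{2,4} T_{1,4} T_{1,2}$ maps every input vector onto the corresponding output vector.

```python
import math


def beam_splitter_matrix(p, q, t, r, dim=4):
    """Identity with I_pp, I_qq -> t, I_pq -> r, I_qp -> -r (modes counted from 1)."""
    m = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    m[p - 1][p - 1] = t
    m[q - 1][q - 1] = t
    m[p - 1][q - 1] = r
    m[q - 1][p - 1] = -r
    return m


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def eight_port(s):
    """M(4) = T_{2,4} T_{1,4} T_{1,2} for equal overlaps s (assumed parameters)."""
    a = math.sqrt((1 - s) * (1 + 2 * s))
    b = math.sqrt(1 - s * math.sqrt(2))
    c = math.sqrt(1 + s - s * math.sqrt(2))
    k = math.sqrt(s * math.sqrt(2))
    t12 = beam_splitter_matrix(1, 2, math.sqrt(2 * (1 - s) / 3), -math.sqrt((1 + 2 * s) / 3))
    t14 = beam_splitter_matrix(1, 4, c / a, -b * k / a)
    t24 = beam_splitter_matrix(2, 4, b, -k)
    return matmul(t24, matmul(t14, t12))


def input_states(s):
    u, w, h = math.sqrt((1 + 2 * s) / 3), math.sqrt((1 - s) / 6), math.sqrt((1 - s) / 2)
    return [[u, 2 * w, 0.0, 0.0], [u, -w, h, 0.0], [u, -w, -h, 0.0]]


def output_states(s):
    b = math.sqrt(1 - s * math.sqrt(2))
    c = math.sqrt(1 + s - s * math.sqrt(2))
    h, f = math.sqrt((1 - s) / 2), math.sqrt(s / math.sqrt(2))
    return [[0.0, b, 0.0, math.sqrt(s * math.sqrt(2))],
            [c / math.sqrt(2), 0.0, h, f],
            [c / math.sqrt(2), 0.0, -h, f]]
```

In this reading the squared fourth component of each output vector is the failure probability of the corresponding input, so the outputs reproduce $q_1 = \sqrt{2}\,s$ and $q_2 = q_3 = s/\sqrt{2}$.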

An especially simple network will suffice for our second 
example. The input vectors are 



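$$
|\psi_1\rangle_{\rm in} =
\begin{pmatrix} \sqrt{2/3} \\ 0 \\ 1/\sqrt{3} \\ 0 \end{pmatrix}, \qquad
|\psi_2\rangle_{\rm in} =
\begin{pmatrix} 0 \\ 1/\sqrt{3} \\ \sqrt{2/3} \\ 0 \end{pmatrix}, \qquad
|\psi_3\rangle_{\rm in} =
\begin{pmatrix} 0 \\ -1/\sqrt{3} \\ \sqrt{2/3} \\ 0 \end{pmatrix}.
\tag{4.19}
$$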



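These input states have the property that

$$
{}_{\rm in}\langle\psi_1|\psi_2\rangle_{\rm in} = {}_{\rm in}\langle\psi_1|\psi_3\rangle_{\rm in} = \frac{\sqrt{2}}{3}, \qquad
{}_{\rm in}\langle\psi_2|\psi_3\rangle_{\rm in} = \frac{1}{3}.
\tag{4.20}
$$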



The optimal failure probabilities are found to be $q_1 = 2/3$
and $q_2 = q_3 = 1/3$. Using Eqs. (3.7) and (3.8) this gives

$$
Q = \frac{4}{9}
\tag{4.21}
$$

for the minimum average failure probability of this kind
of generalized measurement. This is to be compared to
5/9, the average failure probability of a von Neumann-type
projective measurement.

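The output vectors, $|\psi_i\rangle_{\rm out} = |\psi_i'\rangle + |\phi_i\rangle$, can again be
computed by the method outlined previously. Doing so
gives us

$$
|\psi_1\rangle_{\rm out} =
\begin{pmatrix} 1/\sqrt{3} \\ 0 \\ 0 \\ \sqrt{2/3} \end{pmatrix}, \qquad
|\psi_2\rangle_{\rm out} =
\begin{pmatrix} 0 \\ 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{pmatrix}, \qquad
|\psi_3\rangle_{\rm out} =
\begin{pmatrix} 0 \\ -1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{pmatrix}.
\tag{4.22}
$$

The matrix M(4) can be chosen to be

$$
M(4) =
\begin{pmatrix}
1/\sqrt{2} & 0 & 0 & -1/\sqrt{2} \\
0 & 1 & 0 & 0 \\
-1/2 & 0 & 1/\sqrt{2} & -1/2 \\
1/2 & 0 & 1/\sqrt{2} & 1/2
\end{pmatrix},
\tag{4.23}
$$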



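and it can be expressed as

$$
M(4) = T_{3,4}\, T_{1,4}.
\tag{4.24}
$$

In this case, both $T_{1,4}$ and $T_{3,4}$ represent 50-50 beam
splitters, and they are given explicitly by

$$
\begin{aligned}
T_{1,4}: &\quad t = \frac{1}{\sqrt{2}}, & r &= -\frac{1}{\sqrt{2}}, \\
T_{3,4}: &\quad t = \frac{1}{\sqrt{2}}, & r &= -\frac{1}{\sqrt{2}}.
\end{aligned}
\tag{4.25}
$$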
This last example constitutes what is probably the simplest
choice of the set of parameters for a possible experimental
realization.
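The second example is small enough to verify by hand, but a short numerical sketch makes the bookkeeping explicit. It assumes the four-component input vectors and the two 50-50 beam splitters as read from Eqs. (4.19), (4.24) and (4.25) here, takes mode 4 as the failure port, and assumes equal a priori probabilities of 1/3. The helper names are illustrative, not taken from the paper.

```python
import math

def beam_splitter(p, q, t, r, n=4):
    """T_{p,q}: n x n identity with I_pp = I_qq = t, I_pq = r, I_qp = -r (modes are 1-based)."""
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    T[p - 1][p - 1] = t
    T[q - 1][q - 1] = t
    T[p - 1][q - 1] = r
    T[q - 1][p - 1] = -r
    return T

def matmul(X, Y):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

h = 1.0 / math.sqrt(2.0)
# M(4) = T_{3,4} T_{1,4}, both 50-50 with t = 1/sqrt(2), r = -1/sqrt(2) (assumed signs).
M = matmul(beam_splitter(3, 4, h, -h), beam_splitter(1, 4, h, -h))

inputs = [
    [math.sqrt(2.0 / 3.0), 0.0, 1.0 / math.sqrt(3.0), 0.0],   # |psi_1>_in
    [0.0, 1.0 / math.sqrt(3.0), math.sqrt(2.0 / 3.0), 0.0],   # |psi_2>_in
    [0.0, -1.0 / math.sqrt(3.0), math.sqrt(2.0 / 3.0), 0.0],  # |psi_3>_in
]
outputs = [apply(M, v) for v in inputs]

# Probability of finding the photon in mode 4, taken here as the inconclusive port.
q = [out[3] ** 2 for out in outputs]
# Average failure probability, assuming equal priors eta_i = 1/3.
Q = sum(q) / 3.0
```

With these inputs only $|\psi_1\rangle$ reaches mode 1 and only $|\psi_2\rangle$ and $|\psi_3\rangle$ reach modes 2 and 3, so a click in modes 1 to 3 identifies the subset without error, while the mode-4 probabilities reproduce $q_1 = 2/3$, $q_2 = q_3 = 1/3$ and $Q = 4/9$.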



V. CONCLUSIONS 

The usual problem considered when trying to unam- 
biguously discriminate among quantum states is to cor- 
rectly identify which state a given system is in when one 
knows the set of possible states in which it can be pre- 
pared. Here we have considered a different problem. The 
set of possible states is divided into two subsets, and we 
only want to know to which subset the quantum state 
of our given system belongs. As this is a less ambitious 
task than actually identifying the state, we expect that 
our probability to be successful will be greater for attain- 
ing this more limited goal. 

We considered the simplest instance of this problem,
the situation in which we are trying to discriminate
between a set containing one quantum state and another
containing two. A method for finding the optimal strategy
for discriminating between these two sets was presented,
and analytical solutions for particular cases were
given. In addition, we have shown that if the quantum
states are single-photon states, where the photon can be
split among several modes, the optimal discrimination
strategy can be implemented by using a linear optical
network.

These ideas can be extended in a number of different
ways. One possibility is to consider the situation in which
one is given N qubits, each of which is in either the state
$|\psi_1\rangle$ or $|\psi_2\rangle$, where these states are not orthogonal. What
we would like to know is how many of the qubits are in
the state $|\psi_1\rangle$. In order to phrase this problem in a way
that makes its connection to the problems considered in
this paper clear, we note that the total set of possible
states for this problem consists of $2^N$ states (the states
are strings of qubits), and this can be divided up into
the subsets $S_n$, where the members of $S_n$ are sequences
of N qubits in which n are in the state $|\psi_1\rangle$. For a given
sequence of qubits, our problem is to determine to which
of the sets $S_n$ it belongs. Another possibility is to use
these methods to compare strings of qubits in order to
find out if they are identical or not. Again, suppose that
we have strings of N qubits in which each qubit is in one
of the two non-orthogonal states, $|\psi_1\rangle$ or $|\psi_2\rangle$. We are
given two of these strings and want to know if they are
the same or not. In this case, our set of possible states
consists of pairs of strings, and hence has $2^{2N}$ members.
This is divided into two subsets, the first, $S_{\rm equal}$, consisting
of pairs of identical N-qubit strings ($2^N$ members),
and its complement, $\bar{S}_{\rm equal}$, consisting of everything else.
Our task, when given two sequences of N qubits, is to
decide if they are in $S_{\rm equal}$ or in $\bar{S}_{\rm equal}$ [18]. More
detailed consideration of these problems remains for future
research.
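The counting behind these two partitions can be illustrated with a small sketch. It only enumerates the classical labels of the strings (1 for $|\psi_1\rangle$, 2 for $|\psi_2\rangle$) and says nothing about how to discriminate the subsets; the function names are hypothetical.

```python
from itertools import product
from math import comb

def subsets_by_count(N):
    """Group the 2^N label strings into S_n, n = number of qubits labelled 1."""
    sets = {n: [] for n in range(N + 1)}
    for labels in product((1, 2), repeat=N):
        sets[labels.count(1)].append(labels)
    return sets

def count_equal_pairs(N):
    """Return (size of S_equal, total number of ordered pairs of N-qubit strings)."""
    strings = list(product((1, 2), repeat=N))
    equal = sum(1 for x in strings for y in strings if x == y)
    return equal, len(strings) ** 2

sizes = [len(members) for members in subsets_by_count(3).values()]
pairs = count_equal_pairs(2)
```

For N = 3 the subsets $S_0, \ldots, S_3$ have binomial sizes that sum to $2^3$, and for N = 2 there are $2^2$ equal pairs among $2^4$ pairs in total, in line with the counts quoted in the text.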
APPENDIX A: DERIVATION OF THE OPTIMAL
SOLUTION VIA THE METHOD OF LAGRANGE
MULTIPLIERS

In this section, we shall show that by using the method
of Lagrange multipliers, we can derive the conclusions
contained in Eqs. (2.23)-(2.28) rigorously, starting from
the fact that for optimal discrimination, the vectors $|\phi_i\rangle$
must be linearly dependent. To express this statement
in a compact form we define the positive semidefinite
matrix C, where $C_{ij} = \langle\phi_i|\phi_j\rangle$. Then, in general, if $|\phi_i\rangle$
(i = 1, . . . , n) are linearly dependent, the determinant of
matrix C must vanish, $\Delta = \det(C) = 0$. With the
help of Eqs. (2.5) and (2.7), we can eliminate two of the
three overlaps from the matrix C and obtain explicitly
$$
q_1 q_2 q_3 - r^2 q_1 - |O_{13}|^2 q_2 - |O_{12}|^2 q_3
+ 2|O_{12}||O_{13}|\, r\cos(\theta - \alpha) = 0.
\tag{A1}
$$

Here $O_{ij}$ again denotes $\langle\psi_i|\psi_j\rangle$, $r e^{i\theta} = \langle\phi_2|\phi_3\rangle$ is the
remaining overlap, where r and $\theta$ are to be determined from
the conditions for optimum, and $\alpha = -\arg(O_{12} O_{13}^{*})$.
Since C is positive semidefinite, all the diagonal
subdeterminants of $\Delta$ must be non-negative.

We now wish to minimize the average probability of
failure Q, Eq. (2.1), subject to the constraint in Eq. (A1).
This can be done by minimizing the quantity

$$
\tilde{Q} = \sum_i \eta_i q_i + \lambda\Delta,
\tag{A2}
$$

where $\lambda$ is a Lagrange multiplier. The conditions for
minimum with respect to r and $\theta$, $\partial\tilde{Q}/\partial r = 0$ and
$\partial\tilde{Q}/\partial\theta = 0$, lead immediately to



$$
|O_{12}||O_{13}|\cos(\theta - \alpha) - q_1 r = 0,
\tag{A3}
$$

$$
r\,|O_{12}||O_{13}|\sin(\theta - \alpha) = 0.
\tag{A4}
$$



The solutions of these equations, corresponding to the
minimum of Q, are

$$
\theta = \alpha,
\tag{A5}
$$

and

$$
q_1 r = |O_{12}||O_{13}|.
\tag{A6}
$$



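Next, we perform the optimization with respect to the
remaining variables. Notice that the derivative of $\tilde{Q}$, the
quantity in Eq. (A2), with respect to $\lambda$ returns Eq. (A1).
Therefore, we use the optimal values of r and $\theta$ in Eq. (A1)
and in the conditions for minimum with respect to the failure
probabilities, $\partial\tilde{Q}/\partial q_i = 0$ for i = 1, 2, 3. After some algebra we
obtain the following set of equations

$$
\begin{align}
\Delta_{12}\Delta_{13} &= 0, \tag{A7} \\
q_1^2\,\frac{\partial\tilde{Q}}{\partial q_1} = \eta_1 q_1^2 + \lambda\left(\Delta_{12}\Delta_{13} + |O_{12}|^2\Delta_{13} + |O_{13}|^2\Delta_{12}\right) &= 0, \tag{A8} \\
\frac{\partial\tilde{Q}}{\partial q_2} = \eta_2 + \lambda\Delta_{13} &= 0, \tag{A9} \\
\frac{\partial\tilde{Q}}{\partial q_3} = \eta_3 + \lambda\Delta_{12} &= 0, \tag{A10}
\end{align}
$$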



where $\Delta_{12}$ and $\Delta_{13}$ are the diagonal subdeterminants of
$\Delta$,

$$
\Delta_{12} = q_1 q_2 - |O_{12}|^2,
\tag{A11}
$$

$$
\Delta_{13} = q_1 q_3 - |O_{13}|^2.
\tag{A12}
$$
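The algebra leading to Eqs. (A7)-(A10) can be sketched as follows. This is a reconstruction from Eqs. (A1), (A5), (A6), (A11) and (A12) only, and it assumes $q_1 > 0$; it is offered as a consistency check rather than as the authors' own derivation. Inserting $\theta = \alpha$ and $r = |O_{12}||O_{13}|/q_1$ into Eq. (A1) makes the determinant factorize, and differentiating Eq. (A1) at fixed r and $\theta$ gives the three remaining conditions.

```latex
\begin{align*}
\Delta\big|_{\theta=\alpha,\; q_1 r = |O_{12}||O_{13}|}
  &= q_1 q_2 q_3 - |O_{13}|^2 q_2 - |O_{12}|^2 q_3 + \frac{|O_{12}|^2 |O_{13}|^2}{q_1} \\
  &= \frac{1}{q_1}\left(q_1 q_2 - |O_{12}|^2\right)\left(q_1 q_3 - |O_{13}|^2\right)
   = \frac{\Delta_{12}\Delta_{13}}{q_1}, \\[4pt]
\frac{\partial\Delta}{\partial q_1}
  &= q_2 q_3 - r^2
   = \frac{\Delta_{12}\Delta_{13} + |O_{12}|^2\Delta_{13} + |O_{13}|^2\Delta_{12}}{q_1^2}, \\[4pt]
\frac{\partial\Delta}{\partial q_2} &= q_1 q_3 - |O_{13}|^2 = \Delta_{13},
\qquad
\frac{\partial\Delta}{\partial q_3} = q_1 q_2 - |O_{12}|^2 = \Delta_{12}.
\end{align*}
```

The first line shows that the constraint $\Delta = 0$ is equivalent to $\Delta_{12}\Delta_{13} = 0$, and the derivatives, combined with $\partial(\sum_i \eta_i q_i)/\partial q_i = \eta_i$, give the stationarity conditions in the form used above.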



We now have four variables, $q_1$, $q_2$, $q_3$, and $\lambda$, and four
equations, Eqs. (A7)-(A10), to find them. Eq. (A7)
tells us that at least one of the diagonal subdeterminants
vanishes. With no loss of generality we can assume this
to be $\Delta_{12} = 0$. Comparing this to Eq. (A10) we see that
$\lambda$ must be singular. The singularity, however, is tractable
since the same equation tells us that the product $\lambda\Delta_{12}$ is
finite. Then it follows from the singular behavior of $\lambda$ and
Eq. (A9) that the other diagonal subdeterminant also
vanishes, $\Delta_{13} = 0$, but the product $\lambda\Delta_{13}$ also remains
finite. Using these finite values from Eqs. (A9)-(A10) in
Eq. (A8), we can summarize our findings as follows



$$
\Delta_{12} = \Delta_{13} = 0,
\tag{A13}
$$

which is just equation (2.8), and

$$
\eta_1 q_1^2 - \eta_2 |O_{12}|^2 - \eta_3 |O_{13}|^2 + \lambda\Delta_{12}\Delta_{13} = 0.
\tag{A14}
$$



Multiplying Eq. (A9) by $\Delta_{12}$ (or Eq. (A10) by $\Delta_{13}$) and
taking into account Eq. (A13) gives that the singularity
in $\lambda$ is such that $\lambda\Delta_{12}\Delta_{13} = 0$. Using this in Eq. (A14)
we finally obtain

$$
\eta_1 q_1^2 - \eta_2 |O_{12}|^2 - \eta_3 |O_{13}|^2 = 0.
\tag{A15}
$$



This is the solution found in Section II, Eq. (2.11), and
the rest of Section II follows from here and Eq. (A13).

For the sake of completeness we also give the expression
for $1/\lambda$,

$$
\frac{1}{\lambda} = -\sqrt{\frac{\Delta_{12}\Delta_{13}}{\eta_2\eta_3}},
\tag{A16}
$$



which exhibits no singularity. In fact, $1/\lambda \to 0$ when
$\Delta_{12} = \Delta_{13} = 0$, as expected. Finally, let us note that
Eq. (A13), which is identical to Eq. (2.8), implies that
all of the failure vectors, $|\phi_i\rangle$, are parallel to each other,
i.e. they lie in a space, $\mathcal{A}$, of dimension one.
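The closed-form solution contained in Eqs. (A6), (A13) and (A15) is easy to evaluate numerically. The sketch below is an illustration under stated assumptions: it covers only the regime in which all resulting failure probabilities are at most 1 and $q_1 > 0$, it takes the overlaps by modulus, and the function names are hypothetical. The determinant check evaluates Eq. (A1) at $\theta = \alpha$.

```python
import math

def optimal_failure_probabilities(eta, O12, O13):
    """Evaluate q1, q2, q3, r and Q from Eqs. (A15), (A13) and (A6).

    eta is (eta1, eta2, eta3); O12 and O13 may be complex, only moduli enter.
    Assumes at least one overlap is non-zero and that all q_i come out <= 1.
    """
    eta1, eta2, eta3 = eta
    a, b = abs(O12), abs(O13)
    if a == 0 and b == 0:
        raise ValueError("both overlaps vanish; the formulas below divide by q1")
    q1 = math.sqrt((eta2 * a ** 2 + eta3 * b ** 2) / eta1)  # Eq. (A15)
    q2 = a ** 2 / q1                                        # Delta_12 = 0
    q3 = b ** 2 / q1                                        # Delta_13 = 0
    r = a * b / q1                                          # Eq. (A6)
    if max(q1, q2, q3) > 1:
        raise ValueError("outside the regime covered by this sketch (some q_i > 1)")
    Q = eta1 * q1 + eta2 * q2 + eta3 * q3
    return q1, q2, q3, r, Q

def gram_determinant(q1, q2, q3, r, a, b):
    """Left-hand side of Eq. (A1) at theta = alpha, with a = |O12| and b = |O13|."""
    return q1 * q2 * q3 - r ** 2 * q1 - b ** 2 * q2 - a ** 2 * q3 + 2 * a * b * r

# Second example of Section IV: equal priors and |O12| = |O13| = sqrt(2)/3.
overlap = math.sqrt(2.0) / 3.0
q1, q2, q3, r, Q = optimal_failure_probabilities((1 / 3, 1 / 3, 1 / 3), overlap, overlap)
det = gram_determinant(q1, q2, q3, r, overlap, overlap)
```

For these inputs the sketch returns $q_1 = 2/3$, $q_2 = q_3 = 1/3$ and $Q = 4/9$, and the determinant vanishes to rounding error, consistent with the failure vectors spanning a one-dimensional space.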